Abstract

We introduce the startup energy of energy-recovery systems. We point 
out the inverse relationship between the startup energy and the dissipa- 
tion during continuous operation. We illustrate the relationship with two 
CMOS circuit examples. 

1 Introduction 

A digital electronic circuit carries out its operations by distributing energy 
among its circuit nodes according to rules laid down in the interconnection 
of its switching devices. The energy levels on its circuit nodes— its signal 
energies— change as its signals transition between the logic levels used to 
encode the information being processed. 

The energy dissipated as a result of a signal transition is, in the widely
used logic design styles, at least as large as the change in the signal energy.
Moreover, this switching energy frequently dominates the total dissipation 
of a system designed for low power. Therefore, the two directions most 
commonly followed in low-power design are to minimize the number of 
switching events, and to reduce the signal energies. 

Signal energies may be reduced by decreasing the signal voltage swing
[1]. This method is widely used and quite effective. It yields predictable
and intuitively reasonable results due to the strong binding of switching
energy to signal energy. It cannot, however, be extended without bound:
most available switching devices require a minimum voltage swing on the 
controlling electrode to function properly. Thus, the binding of switching 
energy to signal energy sets a lower limit on dissipation. 

Many methods and approaches have been proposed (several recently 
[2, 3, 4, 5]) to relax the relationship between switching energy and signal 
energy. The theme common to all these approaches is that signal energies 
inside the logic circuit are recovered rather than dissipated as heat. The
concept is illustrated in Figure 1. In the conventional system, energy is
delivered directly and unidirectionally from a source (such as a battery)
to the logic circuit, where it is dissipated. The situation is different in
the energy-recovery system. Here, energy is delivered from the source to
a recycling apparatus; the latter supplies energy to and recovers energy
from the logic circuit. Thus, energy transfer is unidirectional from the
source to the recycler, but bidirectional between the recycler and the logic
circuit.

Figure 1: Conceptual conventional and energy-recovery computing systems.
The arrows indicate energy flow.

This research is sponsored by ARPA under contract DAAL01-95-K3528.

The system organization shown in Figure 1 is conceptual only. In
practical cases, it may be difficult to separate logic circuit from recycling 
apparatus, and recycler from source. We will discriminate source from re- 
cycler based on the direction of energy transport: a source delivers energy 
to the rest of the system, but never receives energy back. Furthermore, 
when we discuss the total energy present in an energy-recovery system, 
we exclude energy stored in sources; these are considered external to the
system being studied.

Clearly, the energy present in an energy-recovery computing system 
exceeds the energy dissipated per "operation" (otherwise, there would be
no energy to recover upon completion). All this energy must initially be 
delivered to the system to make subsequent low-power operation possible. 
Therefore, energy-recovery computing systems require an initial energy 
investment, a "startup energy," to reach the steady state where compu- 
tation can be carried out efficiently. This startup energy has received 
considerably less attention than the energy dissipated during continuous 
operation. This situation is doubly unfortunate: first, for a system which 
does burst-like calculations and is frequently restarted, the startup energy 
may dominate the steady-state energy; and second, there is frequently an 
inverse relationship between the startup energy and the steady-state
energy, such that when a design parameter is adjusted to reduce the
steady-state energy, the startup energy grows correspondingly.

Figure 2: A series LRC circuit models a resonant energy-recovery circuit. The
logic circuit contributes the resistive and capacitive components, whereas the
inductance represents the power supply.

The designer of an energy-recovery computing system must therefore 
be aware not only of how to minimize the steady-state energy, but also of 
how to trade it off against the startup energy. These tradeoffs are steered
by relations which many designers would consider less than intuitive. 

In this paper, we will discuss two examples of energy-recovery logic 
circuits— a resonant LRC circuit and a stepwise signal driver— and ex- 
plore the relationships between startup energy and steady-state energy. 
Both examples show an inverse relationship between stored energy and 
dissipated energy. In the stepwise-driver example, we also explicitly com- 
pute the total startup energy, which shows an even stronger dependence 
on the dissipated energy. In both cases, we assume a CMOS technology. 
MOS switching devices are assumed in most discussions about low-power 
logic circuits: the absence of recombination of controlling and controlled 
charges allows not only frugal operation, but also accurate book-keeping, 
which simplifies evaluation of proposed circuit schemes.

2 Resonant LRC circuits 

Most proposals for energy-recovery CMOS systems assume that inductors 
are used for temporary energy storage. We follow the examples of Koller 
[6] and Younis [2] and model the computing system, including the power
supply, as a series LRC circuit (Figure 2). The resistive and capacitive
components of the LRC circuit represent channel resistances and gate and
parasitic capacitances of the CMOS logic circuit. The details of the logic
style and the method of energy replenishment are invisible at this level of
abstraction. Also, most proposed energy-recovery circuit styles use several 
clock phases, which would correspond to several resonant circuits. Given 
that the resonance frequency, the Q value, and the voltage swing of all 
phase circuits are similar, they can be adequately represented by one LRC 
circuit consisting of the parallel connections of the resistive, capacitive, 
and inductive components of all the clock-phase circuits. 

The LRC circuit energy resonates between the capacitance and the
inductance: when the inductance current is zero, the capacitance voltage is
at its peak, and all the energy is stored in the capacitance; the inductance
current is at its peak when the capacitance is completely discharged, and
all the energy is stored in the inductance. Thus, it is sufficient to know the
voltage swing and the capacitance to determine the total energy present
in the system.

Figure 3: The model of the modified system consists of two copies of the
resistive and capacitive components of the original system connected in parallel;
the inductance of the modified system is selected for identical computational
throughput.

Consider a resonant computing system modelled by an inductance $L_1$,
a capacitance $C_1$, and a resistance $R_1$, which performs a computation in
time $K_1 \cdot T_1$ (Figure 3). $T_1$ is the clock period, and $K_1$ is the number of
clock cycles required to carry out the computation. Assume that
$R_1 \ll \sqrt{L_1/C_1}$, so that the equivalent LRC circuit is heavily underdamped.
Then:

$$T_1 \approx 2\pi \sqrt{L_1 C_1}$$

The total energy present in the model system, $E_1$, is determined by $C_1$
and the voltage swing $V$:

$$E_1 = \frac{1}{2} C_1 V^2$$

The energy dissipation per clock cycle is, to first order, inversely
proportional to the Q value of the LRC circuit:

$$Q_1 = \frac{1}{R_1} \sqrt{\frac{L_1}{C_1}}$$

The total energy dissipation during the computation is then:

$$E_{\mathrm{diss},1} \approx K_1 \left(\mathrm{const} \cdot \frac{E_1}{Q_1}\right) = \mathrm{const} \cdot K_1 E_1 R_1 \sqrt{\frac{C_1}{L_1}}$$






Next, we construct a modified system, characterized by the components
$R_2$, $C_2$, and $L_2$, which performs a computation in time $K_2 \cdot T_2$. The
modified system carries out the calculations in the same total time as
the original system, i.e., $K_2 \cdot T_2 = K_1 \cdot T_1$; however, it uses increased
parallelism to complete the work in half as many cycles as the original
system, i.e., $K_2 = K_1/2$. Clearly, $T_2 = 2 \cdot T_1$.

For simplicity, we assume that the computations may be parallelized
without overhead, such that twice the amount of computing circuitry
yields twice the number of useful operations per clock cycle. Then, as
illustrated in Figure 3, $C_2 = 2 \cdot C_1$ and $R_2 = R_1/2$. The required value
for $L_2$ is given by:

$$L_2 = \frac{1}{C_2} \left(\frac{T_2}{2\pi}\right)^2 = \frac{1}{2 C_1} \left(\frac{2 T_1}{2\pi}\right)^2 = 2 L_1$$

Similarly, we can express the total energy, the Q-value, and the total
energy dissipation for the modified system in terms of the corresponding
entities for the original system:

$$E_2 = \frac{1}{2} C_2 V^2 = 2 E_1$$

$$Q_2 = \frac{1}{R_2} \sqrt{\frac{L_2}{C_2}} = \frac{2}{R_1} \sqrt{\frac{2 L_1}{2 C_1}} = 2 Q_1$$

$$E_{\mathrm{diss},2} \approx K_2 \left(\mathrm{const} \cdot \frac{E_2}{Q_2}\right) = \frac{K_1}{2} \left(\mathrm{const} \cdot \frac{2 E_1}{2 Q_1}\right) = \frac{1}{2} \cdot E_{\mathrm{diss},1}$$

Thus, the total dissipation associated with the calculation has been re- 
duced by a factor of 2. The total energy present in the computing system, 
however, has increased by the same factor. 
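
The scaling argument can be checked numerically. The following is a minimal sketch, not part of the original analysis: the component values are hypothetical, the stored energy is assumed to be C V^2 / 2, and the proportionality constant in the per-cycle dissipation is assumed to be 2*pi (the usual first-order value for a lightly damped resonator). The function name `lrc_metrics` is illustrative only.

```python
import math


def lrc_metrics(inductance, resistance, capacitance, swing, cycles, const=2 * math.pi):
    """Period, stored energy, Q, and total dissipation of a series LRC model.

    Assumptions (not from the paper): stored energy is taken as C*V^2/2, and
    `const` defaults to 2*pi in energy-per-cycle = const * stored / Q.
    """
    period = 2 * math.pi * math.sqrt(inductance * capacitance)
    stored = 0.5 * capacitance * swing ** 2
    q_value = math.sqrt(inductance / capacitance) / resistance
    dissipated = cycles * const * stored / q_value
    return period, stored, q_value, dissipated


# Hypothetical component values, chosen only so that R << sqrt(L/C).
L1, R1, C1, V, K1 = 1e-6, 1.0, 1e-10, 3.3, 1000

original = lrc_metrics(L1, R1, C1, V, K1)
# Modified system: doubled circuitry (C2 = 2*C1, R2 = R1/2), L2 = 2*L1,
# and half as many clock cycles for the same computation.
modified = lrc_metrics(2 * L1, R1 / 2, 2 * C1, V, K1 // 2)

for name, (period, stored, q_value, dissipated) in (("original", original),
                                                    ("modified", modified)):
    print(f"{name}: T={period:.3e} s  E={stored:.3e} J  "
          f"Q={q_value:.1f}  E_diss={dissipated:.3e} J")
```

Under these assumptions the modified system shows twice the stored energy and half the total dissipation, independently of the particular values chosen.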

The energy present in the resonant system during operation consti- 
tutes a lower limit on the energy injected into the system during startup. 
Some of the startup energy will not remain in the system, but rather be 
dissipated in the process of moving the system to its low-power state. 
This overhead depends highly on the exact circuit solution; no estimate 
is offered here. 

3 Stepwise driver circuits 

Unlike the LRC circuit, a stepwise driver does not require any additional 
circuits for startup. Therefore, its operation can be modelled more accu- 
rately than that for the resonant circuit. Furthermore, a single analysis 
yields both the startup cost and the cost per "operation". 

Consider a stepwise driver with $N$ steps, consisting of a capacitive
load $C_L$ and $N - 1$ tank capacitors with identical capacitance, $C_T$, as
shown in Figure 4. The tank capacitors play the role of the recycler in
this configuration.

Figure 4: A stepwise signal driver uses a bank of tank capacitors for temporary
energy storage.

Let $V_i$ be the voltage of the $i$th tank capacitor. The load is charged
by briefly connecting it to $V_1$ through $V_N$ in succession (by opening switch 0
and closing switch 1, waiting for the load to reach $V_1$, opening switch 1
and closing switch 2, etc.). Switch $N$ is kept closed for as long as the load
capacitance is to stay charged. The load is discharged by connecting it to
$V_{N-1}$ through $V_1$, and finally to ground by closing switch 0.

We assume that the circuit components are linear and that each MOS
switch is turned on long enough to complete
the charging/discharging of the load $C_L$. We also assume that the tank
capacitance $C_T$ is much larger than the load capacitance $C_L$. With these
assumptions, the following result can be shown.

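Theorem 3.1 Assume that an $N$-step stepwise driver circuit operates under the
previously stated assumptions. Then, the tank capacitor voltages converge
to the following values:

$$V_i = \frac{i}{N} V \quad \text{for } i = 1, 2, \ldots, N-1$$

The convergence rate $R$ is given by:

$$R = \max\{|\lambda_{\max}|, |\lambda_{\min}|\}$$

Here, $\lambda_{\max}$ and $\lambda_{\min}$ are given by:

$$\lambda_{\max} = 1 - 2\frac{C_L}{C_T} + 2\frac{C_L}{C_T} \cos\left(\frac{\pi}{N}\right)$$

$$\lambda_{\min} = 1 - 2\frac{C_L}{C_T} - 2\frac{C_L}{C_T} \cos\left(\frac{\pi}{N}\right)$$

Proof: See Appendix A for details.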
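
The convergence stated in Theorem 3.1 can be illustrated by iterating the per-cycle tank-voltage update derived in Appendix A. The following is a minimal sketch under the theorem's idealizations (linear components, complete settling at every step, tank voltages constant within a cycle); the function name and numerical values are hypothetical.

```python
def simulate_tank_voltages(n_steps, v_supply, c_load, c_tank, cycles):
    """Iterate the per-cycle tank-voltage update, starting from empty tanks."""
    kappa = c_load / c_tank
    # v[0] is ground and v[n_steps] is the supply; v[1..n_steps-1] are the tanks.
    v = [0.0] * (n_steps + 1)
    v[n_steps] = v_supply
    for _ in range(cycles):
        prev = v[:]
        for i in range(1, n_steps):
            # Net charge gained by tank i, divided by its capacitance.
            v[i] = prev[i] + kappa * (prev[i + 1] + prev[i - 1] - 2 * prev[i])
    return v[1:n_steps]


# Hypothetical values: 4 steps, 1 V supply, tank capacitance 10x the load.
tanks = simulate_tank_voltages(n_steps=4, v_supply=1.0,
                               c_load=1.0, c_tank=10.0, cycles=2000)
print([round(x, 6) for x in tanks])
```

With enough cycles the printed voltages should approach the evenly spaced values i*V/N; a smaller ratio of load to tank capacitance slows the approach.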



As for the resonant circuit, we can explore the relationships between
startup energy and dissipated energy. For each "step" from $V_i$ to $V_{i+1}$,
the dissipation is given by the transferred charge and the average voltage
drop across the switch:

$$E_{\mathrm{diss,step}} = Q \bar{V} = C_L \frac{V}{N} \cdot \frac{V}{2N} = \frac{1}{2} C_L \frac{V^2}{N^2}$$

To charge $C_L$ all the way to the supply voltage $V$, $N$ steps are used. The
dissipation for a full transition from 0 to $V$ is then:

$$E_{\mathrm{diss},N\text{-up}} = N \cdot E_{\mathrm{diss,step}} = \frac{1}{2N} C_L V^2 \qquad (1)$$

This formulation for the energy dissipation ignores the energy needed 
to control the MOS switches. It also assumes that the tank capacitor volt- 
ages do not undergo any fluctuations during a complete charge-discharge 
cycle. If this assumption is removed, then equation (1) must be modified 
to include a correction term which reflects an increased dissipation on 
account of such fluctuations. 

The energy stored in the stepwise driver after convergence is:

$$E_{\mathrm{stored}} = \sum_{i=1}^{N-1} \frac{1}{2} C_T V_i^2 = \frac{1}{2} \cdot \frac{(N-1)(2N-1)}{6N} C_T V^2 \qquad (2)$$

We now investigate how the energy dissipation $E_{\mathrm{diss},N\text{-up}}$ and the
stored energy $E_{\mathrm{stored}}$ depend on the stepwise driver parameter $N$.

First, consider the case when $N$ is increasing, but $C_T$ is constant.
The stored energy $E_{\mathrm{stored}}$ increases linearly with $N$ (for large $N$). Since
dissipation is inversely proportional to $N$, $E_{\mathrm{diss},N\text{-up}}$ decreases as $1/N$. This
behavior is analogous to that exhibited by the LRC resonant circuit.
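
The two trends for constant tank capacitance can be tabulated directly from equations (1) and (2). The following is a small sketch with hypothetical capacitance and voltage values; the function names are illustrative, and the explicit sum is included only to cross-check the closed form.

```python
def dissipation_per_transition(n_steps, c_load, v_supply):
    """Equation (1): dissipation for one full 0 -> V transition in N steps."""
    return c_load * v_supply ** 2 / (2 * n_steps)


def stored_energy_sum(n_steps, c_tank, v_supply):
    """Stored energy as the explicit sum of C_T * V_i^2 / 2 with V_i = i*V/N."""
    return sum(0.5 * c_tank * (i * v_supply / n_steps) ** 2
               for i in range(1, n_steps))


def stored_energy_closed_form(n_steps, c_tank, v_supply):
    """Equation (2): closed form of the same sum."""
    return (0.5 * c_tank * v_supply ** 2
            * (n_steps - 1) * (2 * n_steps - 1) / (6 * n_steps))


# Hypothetical values: C_L = 1 pF, C_T = 100 pF, V = 3.3 V.
C_LOAD, C_TANK, V_SUPPLY = 1e-12, 100e-12, 3.3
for n in (2, 4, 8, 16):
    print(n,
          dissipation_per_transition(n, C_LOAD, V_SUPPLY),
          stored_energy_closed_form(n, C_TANK, V_SUPPLY))
```

Each doubling of N halves the dissipation column, while the stored-energy column roughly doubles once N is large.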

Next, consider the case when $N$ is increasing, but the sum of all the
tank capacitances, $C_{\mathrm{total}}$, remains constant. In this case, the tank
capacitance decreases with $N$:

$$C_T = \frac{C_{\mathrm{total}}}{N-1} \qquad (3)$$

The stored-energy equation (2) can be rewritten as:

$$E_{\mathrm{stored}} = \frac{1}{2} \cdot \frac{(2N-1)}{6N} C_{\mathrm{total}} V^2 \qquad (4)$$

Hence, from equation (4) it follows that the stored energy remains
constant (for large $N$) despite an increasing $N$. However, the dissipation
energy falls with $N$. Note that $N$ cannot be increased arbitrarily beyond
a certain limit to reap the benefit from a diminishing dissipation.
Increasing $N$ brings a reduction in the per-step tank capacitance $C_T$ given
by (3). Therefore, there exists a limit for $N$ at which the per-step tank
capacitance $C_T$ becomes comparable to the load capacitance $C_L$.
At this point, the assumption underlying the proof of convergence
($C_T \gg C_L$) no longer holds. No claims can be made regarding the convergence
in this case, since the mathematical model is not representative of the
actual circuit behavior.
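
The limit on N for a fixed total tank capacitance can be estimated once "much larger" is given a number. The following sketch assumes, purely for illustration, that the model is trusted while the per-step tank capacitance is at least `margin` times the load capacitance; the default margin of 10 and the function names are hypothetical choices, not values from the analysis.

```python
import math


def max_steps(c_total, c_load, margin=10.0):
    """Largest N with C_T = c_total / (N - 1) still at least margin * c_load."""
    return math.floor(c_total / (margin * c_load)) + 1


def stored_energy_fixed_total(n_steps, c_total, v_supply):
    """Equation (4): stored energy when the total tank capacitance is fixed."""
    return 0.5 * (2 * n_steps - 1) / (6 * n_steps) * c_total * v_supply ** 2


# Hypothetical values: total tank capacitance 1000x the load.
n_limit = max_steps(c_total=1000.0, c_load=1.0)
print(n_limit, 1000.0 / (n_limit - 1))
print(stored_energy_fixed_total(n_limit, 1000.0, 1.0))
```

A stricter margin lowers the usable N proportionally, which in turn bounds how far the per-transition dissipation can be reduced in this configuration.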

We can also derive an approximate expression for the energy required
to bring the stepwise-charging system to a state where all the tank
capacitor voltages have converged to their steady-state values:

$$E_{\mathrm{startup}} \approx \frac{N(N-1)}{\pi^2} C_T V^2 \qquad (5)$$

See Appendix B for derivation of this formula. Interestingly, this
expression grows as $N^2$, whereas the stored energy grows as $N$. In this system, a
dissipation reduction therefore requires a linear increase in system energy,
but a quadratic increase in the startup energy.

The dependence can be brought back to linear by allowing more than
one parameter to vary. With a constant total tank capacitance, such
that $C_T = C_{\mathrm{total}}/(N-1)$, startup energy is linear in $N$. However, our
derivations are not valid unless $C_L \ll C_T$, so such a system is not scalable
to large $N$.

We finally observe that for large $N$, the ratio of the stored energy to
the startup energy is proportional to that of the dissipated energy and
the signal energy $E_0 = C_L V^2$:

$$\frac{E_{\mathrm{stored}}}{E_{\mathrm{startup}}} \sim \frac{E_{\mathrm{diss},N\text{-up}}}{E_0}$$

This expression illustrates the problem of optimizing the efficiency of both
the initialization and the operation of the stepwise driver.
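
The proportionality can be examined by evaluating both ratios from equations (1), (2), and (5). The following is a minimal sketch with unit capacitance and voltage (hypothetical values; both ratios are independent of them) and illustrative function names.

```python
import math


def stored_energy(n_steps, c_tank, v_supply):
    """Equation (2): energy stored in the tank capacitors after convergence."""
    return (0.5 * c_tank * v_supply ** 2
            * (n_steps - 1) * (2 * n_steps - 1) / (6 * n_steps))


def startup_energy(n_steps, c_tank, v_supply):
    """Equation (5): approximate startup energy."""
    return n_steps * (n_steps - 1) / math.pi ** 2 * c_tank * v_supply ** 2


def dissipation_ratio(n_steps):
    """E_diss,N-up / E_0 with E_0 = C_L * V^2, from equation (1)."""
    return 1.0 / (2 * n_steps)


for n in (4, 16, 64, 256):
    lhs = stored_energy(n, 1.0, 1.0) / startup_energy(n, 1.0, 1.0)
    rhs = dissipation_ratio(n)
    print(n, lhs, rhs, lhs / rhs)
```

The last column settles toward a constant as N grows (pi^2 / 3 for these expressions), which is the sense in which the two ratios are proportional.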

4 Conclusion 

We have examined two different energy-recovery systems, using very dis- 
similar approaches. In both cases, the energy stored in the system in- 
creases linearly with decreasing per-operation dissipation. This similarity 
is remarkable in view of the different approaches: in one case, we varied 
the amount of computational parallelism of the system, whereas in the
other case, we changed the granularity of the charging. 

If the energy stored in the system is discarded, i.e., dissipated, when the
computations are completed, the per-operation dissipation cannot usefully
be reduced below the point where the system energy dominates the overall
dissipation. In such cases, the overall energy consumption may be reduced by
allowing the per-operation dissipation to increase. This tradeoff is most
important for systems which do only brief computations separated by long
periods of inactivity.

Tradeoffs such as this one also occur elsewhere in
the design of energy-recovery circuits. A prime example is the choice of
voltage swing in transmission-gate-based logic: reducing the voltage swing
below $4V_{th}$ results in increased dissipation, even though the signal energy
is reduced [4].

In summary, energy-recovery circuits display complex relationships
and tradeoffs between signal energy, switching energy, system energy, and
startup energy. Of these, the signal energy vs. switching energy relationship
has received most attention, but the other relationships may be as important
for the design of truly efficient energy-recovery systems.

A Convergence rate calculation

Since the tank capacitance, $C_T$, is assumed to be much larger than the
load capacitance $C_L$, the voltage on each tank capacitor can be assumed
to remain constant during a charge-discharge cycle. Hence, the charge
pulled from tank capacitor $i$ during charging in cycle $t$ is:

$$q_{i,\mathrm{up}}(t) = C_L \left(V_i(t) - V_{i-1}(t)\right).$$

The charge pushed back into tank capacitor $i$ during discharging is:

$$q_{i,\mathrm{down}}(t) = C_L \left(V_{i+1}(t) - V_i(t)\right).$$

Hence, the net increase in charge on tank capacitor $i$ over the
charge-discharge cycle $t$ is:

$$q_i(t) = q_{i,\mathrm{down}}(t) - q_{i,\mathrm{up}}(t) = C_L \left(V_{i+1}(t) + V_{i-1}(t) - 2V_i(t)\right).$$

The corresponding voltage change on tank capacitor $i$ is:

$$\Delta V_i(t) = \frac{C_L}{C_T} \left(V_{i+1}(t) + V_{i-1}(t) - 2V_i(t)\right). \qquad (6)$$

Thus, the voltage after the charge-discharge cycle is:

$$V_i(t+1) = V_i(t) + \Delta V_i(t). \qquad (7)$$

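We introduce the voltage deviances, $\tilde{V}_i$, $i = 1, 2, \ldots, N-1$, as the
differences between the actual tank voltages and the voltages given by the even
distribution:

$$\tilde{V}_i(t) = V_i(t) - \frac{i}{N} V. \qquad (8)$$

For notational convenience, let $\tilde{V}_0(t) = \tilde{V}_N(t) = 0$ for all $t$. Rewrite equation (6)
in terms of the deviances:

$$\Delta \tilde{V}_i(t) = \Delta V_i(t) = \frac{C_L}{C_T} \left(\tilde{V}_{i+1}(t) + \tilde{V}_{i-1}(t) - 2\tilde{V}_i(t)\right). \qquad (9)$$

Then, combining equations (7) - (9), we obtain a linear system of
equations:

$$\tilde{V}(t+1) = G \cdot \tilde{V}(t) \qquad (10)$$

The variables are defined as follows:

$$\tilde{V}(t+1) = \left[\tilde{V}_1(t+1), \tilde{V}_2(t+1), \ldots, \tilde{V}_{N-1}(t+1)\right]^T$$

$$G = \begin{bmatrix} 1-2\kappa & \kappa & & & \\ \kappa & 1-2\kappa & \kappa & & \\ & \ddots & \ddots & \ddots & \\ & & \kappa & 1-2\kappa & \kappa \\ & & & \kappa & 1-2\kappa \end{bmatrix} \qquad (11)$$

where $\kappa = C_L/C_T$. It is possible to show that the following inequality holds:

$$\|\tilde{V}(t+1)\| \le R \cdot \|\tilde{V}(t)\|$$

$R$ is related to the maximum and the minimum eigenvalues of the matrix
$G$ defined in (11):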

$$R = \max\{|\lambda_{\max}|, |\lambda_{\min}|\}$$

$$\lambda_{\max} = 1 - 2\frac{C_L}{C_T} + 2\frac{C_L}{C_T} \cos\left(\frac{\pi}{N}\right)$$

$$\lambda_{\min} = 1 - 2\frac{C_L}{C_T} - 2\frac{C_L}{C_T} \cos\left(\frac{\pi}{N}\right)$$

$R$ represents the rate of evolution of the error vector and must be
constrained to be less than unity for convergence. This is possible if and
only if

$$1 - 2\frac{C_L}{C_T} - 2\frac{C_L}{C_T} \cos\left(\frac{\pi}{N}\right) > -1,$$

from which it follows that

$$C_T > \left[1 + \cos\left(\frac{\pi}{N}\right)\right] C_L \qquad (12)$$

is sufficient for convergence.

It must be noted that the lower bound for the tank capacitance given
by (12) is sufficient but not necessary for convergence. If $C_T$ is less than
this bound, nothing can be said of convergence since the validity of the
mathematical model holds only when $C_T$ is much larger than $C_L$.
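
The eigenvalue expressions can be checked without a linear-algebra library by applying the tridiagonal matrix of (11) to sine-shaped trial vectors. The following is a sketch resting on the standard result for tridiagonal Toeplitz matrices, namely that the vectors with entries sin(k*pi*i/N) are eigenvectors; the function names and parameter values are hypothetical.

```python
import math


def apply_g(vec, kappa):
    """Multiply the tridiagonal matrix G (diagonal 1-2k, off-diagonals k) by vec."""
    n = len(vec)
    out = []
    for i in range(n):
        left = vec[i - 1] if i > 0 else 0.0
        right = vec[i + 1] if i < n - 1 else 0.0
        out.append((1 - 2 * kappa) * vec[i] + kappa * (left + right))
    return out


def eigenvalue(k, n_steps, kappa):
    """k-th eigenvalue of G, k = 1..N-1; k = 1 is the largest, k = N-1 the smallest."""
    return 1 - 2 * kappa + 2 * kappa * math.cos(k * math.pi / n_steps)


def eigenvector(k, n_steps):
    """Trial eigenvector with entries sin(k*pi*i/N), i = 1..N-1."""
    return [math.sin(k * math.pi * i / n_steps) for i in range(1, n_steps)]


def convergence_rate(n_steps, kappa):
    """R = max(|lambda_max|, |lambda_min|)."""
    return max(abs(eigenvalue(1, n_steps, kappa)),
               abs(eigenvalue(n_steps - 1, n_steps, kappa)))


# Hypothetical values: N = 6 and C_L / C_T = 0.3, which satisfies bound (12).
N_STEPS, KAPPA = 6, 0.3
for k in range(1, N_STEPS):
    vec = eigenvector(k, N_STEPS)
    lam = eigenvalue(k, N_STEPS, KAPPA)
    residual = max(abs(g - lam * x) for g, x in zip(apply_g(vec, KAPPA), vec))
    print(k, round(lam, 6), residual)
print("R =", convergence_rate(N_STEPS, KAPPA))
```

The residuals should sit at floating-point noise, and R stays below unity whenever the capacitance ratio respects bound (12).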

B Startup energy

The energy invested in the stepwise driver to bring the tank voltages to
convergence is injected from the power supply at voltage $V$. Assume that all
capacitor voltages are initially 0. In each charge-discharge cycle, the
amount of charge injected is given by $q_{\mathrm{inj}} = (V - V_{N-1}) C_L$.
When the tank voltages have converged, $V - V_{N-1} = V/N$; the larger
voltage difference during startup causes auxiliary charge to be drawn from
the supply. The total auxiliary charge delivered by the supply, $Q_{\mathrm{startup}}$,
can be approximated by:

$$Q_{\mathrm{startup}} \approx \sum_{t=0}^{\infty} \left(V - \frac{V}{N}\right) \lambda^t C_L \qquad (13)$$

Here, $\lambda$ is the eigenvalue $\lambda_{\max}$ of the matrix defined in
(11), and $C_T$ is assumed to be at least twice as large as $C_L$. The reason
for the choice of $\lambda$ is that the speed of convergence is influenced by the
slowest eigenvalue. Although use of this eigenvalue in equation (13) leads
to a somewhat conservative result, it is adequate for our analysis, and
furthermore, it simplifies the formulation considerably.
Simplifying equation (13), we get:

$$Q_{\mathrm{startup}} \approx \frac{N-1}{N} \cdot \frac{1}{1-\lambda} \cdot C_L V = \frac{N-1}{N} \cdot \frac{1}{2\left[1 - \cos\left(\frac{\pi}{N}\right)\right]} \cdot C_T V \approx \frac{N(N-1)}{\pi^2} \cdot C_T V$$

for sufficiently large $N$.
So,

$$E_{\mathrm{startup}} = Q_{\mathrm{startup}} \cdot V \approx \frac{N(N-1)}{\pi^2} C_T V^2$$
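
The degree of conservatism in the startup-charge estimate can be inspected numerically. The following sketch accumulates the auxiliary charge in the idealized model of Appendix A and evaluates the geometric-series estimate built on the slowest eigenvalue; the function names and parameter values are hypothetical, and the simulation inherits all of the model's assumptions.

```python
import math


def simulated_startup_charge(n_steps, v_supply, c_load, c_tank, cycles):
    """Auxiliary supply charge accumulated while the tanks charge from zero."""
    kappa = c_load / c_tank
    # v[0] is ground and v[n_steps] is the supply; v[1..n_steps-1] are the tanks.
    v = [0.0] * (n_steps + 1)
    v[n_steps] = v_supply
    extra = 0.0
    for _ in range(cycles):
        # Charge drawn from the supply this cycle, minus the steady-state amount.
        extra += (v_supply - v[n_steps - 1] - v_supply / n_steps) * c_load
        prev = v[:]
        for i in range(1, n_steps):
            v[i] = prev[i] + kappa * (prev[i + 1] + prev[i - 1] - 2 * prev[i])
    return extra


def estimated_startup_charge(n_steps, v_supply, c_load, c_tank):
    """Geometric-series estimate using the slowest eigenvalue of G."""
    kappa = c_load / c_tank
    lam = 1 - 2 * kappa + 2 * kappa * math.cos(math.pi / n_steps)
    return (n_steps - 1) / n_steps * c_load * v_supply / (1 - lam)


# Hypothetical values: N = 8, V = 1, C_L = 1, C_T = 20 (arbitrary units).
sim = simulated_startup_charge(8, 1.0, 1.0, 20.0, cycles=5000)
est = estimated_startup_charge(8, 1.0, 1.0, 20.0)
print("simulated:", sim, "estimate:", est)
```

Because the estimate lets the full initial deviation decay at the slowest rate, it should come out at or above the simulated value, and comparing the two for several N gives a feel for how conservative the bound is.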